Online personalized recommendation services are generally hosted in the cloud, where users query a cloud-based model to receive recommended items such as merchandise of interest or news feeds. State-of-the-art recommendation models rely on sparse and dense features to represent users' profile information and the items they interact with. Although sparse features account for 99% of the total model size, little attention has been paid to the potential information leakage through them. These sparse features are employed to track users' behavior, e.g., their click history and object interactions, and so potentially carry each user's private information. Sparse features are represented as learned embedding vectors stored in large tables, and personalized recommendation is performed by using a specific user's sparse features to index into the tables. Even with recently proposed methods that hide the computation happening in the cloud, an attacker in the cloud may still be able to track the access patterns to the embedding tables. This paper explores the private information that may be learned by tracking a recommendation model's sparse feature access patterns. We first characterize the types of attacks that can be carried out on sparse features in recommendation models in an untrusted cloud, and then demonstrate how each of these attacks leads to extracting users' private information or tracking users by their behavior over time.
translated by Google Translate
How do we know when the predictions made by a classifier can be trusted? This is a fundamental problem that also has immense practical applicability, especially in safety-critical areas such as medicine and autonomous driving. The de facto approach of using the classifier's softmax outputs as a proxy for trustworthiness suffers from over-confidence, while more recent approaches incur problems such as additional retraining cost and a trade-off between accuracy and trustworthiness. In this work, we argue that the trustworthiness of a classifier's prediction for a sample is highly associated with two factors: the sample's neighborhood information and the classifier's output. To combine the best of both worlds, we design a model-agnostic post-hoc approach, NeighborAgg, that leverages these two essential sources of information via an adaptive neighborhood aggregation. Theoretically, we show that NeighborAgg is a generalized version of a one-hop graph convolutional network, inheriting the powerful modeling ability to capture the varying similarity between samples within each class. We also extend our approach to the closely related task of mislabel detection and provide a theoretical coverage guarantee to bound the false negatives. Empirically, extensive experiments on image and tabular benchmarks verify our theory and suggest that NeighborAgg outperforms other methods, achieving state-of-the-art trustworthiness performance.
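Federated learning (FL) aims to perform privacy-preserving machine learning on distributed data held by multiple data owners. To this end, FL requires the data owners to perform training locally and share gradient updates (instead of the private inputs) with a central server, where they are securely aggregated over multiple data owners. Although aggregation by itself does not provably offer privacy protection, prior work suggested that it may suffice if the batch size is sufficiently large. In this paper, we propose the Cocktail Party Attack (CPA), which, contrary to this prior belief, is able to recover private inputs from gradients aggregated over a large batch. CPA leverages the crucial insight that the aggregate gradient from a fully connected layer is a linear combination of its inputs, which leads us to frame gradient inversion as a blind source separation (BSS) problem (informally called the cocktail party problem). We adapt independent component analysis (ICA), a classic solution to the BSS problem, to recover private inputs for fully connected and convolutional networks, and show that CPA significantly outperforms prior gradient inversion attacks, scales to ImageNet-sized inputs, and works on large batch sizes of up to 1024.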
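The linear-combination insight can be checked numerically. The following is a minimal sketch, not the CPA attack itself: it assumes a single fully connected layer `z = W @ x + b`, random stand-in values for the inputs and the per-example output gradients, and hypothetical names (`fc_weight_gradient`, `X`, `Delta`) chosen for illustration. It shows only that every row of the batch-averaged weight gradient lies in the span of the batch inputs; the ICA step is omitted.

```python
import numpy as np

def fc_weight_gradient(X, Delta):
    """Batch-averaged gradient of the loss w.r.t. W for z = W @ x + b.

    X:     (B, d_in) batch of layer inputs (the "private" data).
    Delta: (B, d_out) per-example gradients dL/dz_i.
    """
    B = X.shape[0]
    return Delta.T @ X / B  # shape (d_out, d_in)

rng = np.random.default_rng(0)
B, d_in, d_out = 4, 16, 32
X = rng.normal(size=(B, d_in))       # stand-in for private inputs
Delta = rng.normal(size=(B, d_out))  # stand-in for backpropagated signals

G = fc_weight_gradient(X, Delta)

# Same gradient written as an average of per-example outer products.
G_sum = sum(np.outer(Delta[i], X[i]) for i in range(B)) / B

# Each row of G is a linear combination of the rows of X:
# solve X.T @ coeffs = G.T and check that the fit is exact.
coeffs, *_ = np.linalg.lstsq(X.T, G.T, rcond=None)
residual = np.linalg.norm(X.T @ coeffs - G.T)
rank = np.linalg.matrix_rank(G)
```

Because the mixing coefficients are exactly `Delta / B`, an observer of `G` sees the inputs only through an unknown linear mixture, which is the setting blind source separation addresses.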
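Recently, neural techniques have been used to generate source code automatically. While promising for declarative languages, these approaches perform much worse on datasets for imperative languages. Since a declarative language is typically embedded in an imperative language in real-world software development (i.e., turducken-style programming), the promising results on declarative languages can hardly lead to a significant reduction in manual software development effort. In this paper, we define a new code generation task: given a natural language comment, the task aims to generate a program in a base imperative language with an embedded declarative language. To our knowledge, this is the first turducken-style code generation task. For this task, we present Lyra: a dataset in Python with embedded SQL. The dataset contains 2,000 carefully annotated database manipulation programs from real-world projects. Each program is paired with both a Chinese comment and an English comment. In our experiments, we adopt Transformer, BERT-style, and GPT-style models as baselines. In the best setting, the GPT-style model generates better programs than the other models, with AST exact matching accuracies of 24% and 25.5% when using Chinese and English comments, respectively. We therefore believe that Lyra poses a new challenge for code generation. Overcoming this challenge, however, may significantly improve the applicability of code generation techniques in real-world software development.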
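To make the task concrete, here is a small illustrative example of turducken-style code: a natural language comment paired with a Python function that embeds an SQL query. It is a hypothetical sketch written for this summary, not a sample from the Lyra dataset; the `users` table, its columns, and the function name are assumptions, and the standard-library `sqlite3` module stands in for whatever database layer a real project would use.

```python
import sqlite3

def get_user_emails_by_city(conn, city):
    """Comment (model input): return the emails of all users living in
    the given city, sorted alphabetically."""
    cursor = conn.cursor()
    # Embedded declarative code (SQL) inside the imperative base language.
    cursor.execute(
        "SELECT email FROM users WHERE city = ? ORDER BY email",
        (city,),
    )
    return [row[0] for row in cursor.fetchall()]

# Hypothetical in-memory database used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [
        ("a@example.com", "Paris"),
        ("b@example.com", "Oslo"),
        ("c@example.com", "Paris"),
    ],
)
emails = get_user_emails_by_city(conn, "Paris")
```

A generator for this task has to get both layers right at once: the Python control flow around the query and the SQL string inside it.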
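Existing graph neural networks (GNNs) largely rely on node embeddings, which represent a node as a vector derived from its identity, type, or content. However, graphs with unattributed nodes widely exist in real-world applications (e.g., anonymized social networks). Previous GNNs either assign random labels to nodes (which introduces artifacts into the GNN) or assign one embedding to all nodes (which fails to explicitly distinguish one node from another). Furthermore, when these GNNs are applied to unattributed node classification problems, they have an undesired equivariance property, which makes them fundamentally unable to handle data with multiple possible outputs. In this paper, we analyze the limitations of existing approaches to node classification problems. Inspired by our analysis, we propose a generalized equivariance property and a Preferential Labeling technique that satisfies the desired property asymptotically. Experimental results show that we achieve high performance on several unattributed node classification tasks.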
Given the increasingly intricate forms of partial differential equations (PDEs) in physics and related fields, computationally solving PDEs without analytic solutions inevitably suffers from a trade-off between accuracy and efficiency. Recent advances in neural operators, a kind of mesh-independent neural-network-based PDE solver, suggest that this challenge may be overcome. In this emerging direction, the Koopman neural operator (KNO) is a representative demonstration and outperforms other state-of-the-art alternatives in terms of accuracy and efficiency. Here we present KoopmanLab, a self-contained and user-friendly PyTorch module of the Koopman neural operator family for solving partial differential equations. Beyond the original version of KNO, we develop multiple new variants of KNO based on different neural network architectures to improve the general applicability of our module. These variants are validated by mesh-independent and long-term prediction experiments implemented on representative PDEs (e.g., the Navier-Stokes equation and the Bateman-Burgers equation) and ERA5 (i.e., one of the largest high-resolution data sets of global-scale climate fields). These demonstrations suggest the potential of KoopmanLab to be considered in diverse applications of partial differential equations.
Temporal sentence grounding (TSG) aims to identify the temporal boundary of a specific segment in an untrimmed video given a sentence query. All existing works first utilize a sparse sampling strategy to extract a fixed number of video frames and then conduct multi-modal interactions with the query sentence for reasoning. However, we argue that these methods have overlooked two important issues: 1) Boundary-bias: The annotated target segment generally refers to two specific frames as the corresponding start and end timestamps. The video downsampling process may lose these two frames and take adjacent irrelevant frames as new boundaries. 2) Reasoning-bias: Such incorrect new boundary frames also lead to reasoning bias during frame-query interaction, reducing the generalization ability of the model. To alleviate the above limitations, in this paper we propose a novel Siamese Sampling and Reasoning Network (SSRN) for TSG, which introduces a siamese sampling mechanism to generate additional contextual frames that enrich and refine the new boundaries. Specifically, a reasoning strategy is developed to learn the inter-relationship among these frames and generate soft labels on boundaries for more accurate frame-query reasoning. Such a mechanism is also able to supplement the sampled sparse frames with the absent consecutive visual semantics for fine-grained activity understanding. Extensive experiments demonstrate the effectiveness of SSRN on three challenging datasets.
Time-series anomaly detection is an important task and has been widely applied in industry. Since manual data annotation is expensive and inefficient, most applications adopt unsupervised anomaly detection methods, but the results are usually sub-optimal and unsatisfactory to end customers. Weak supervision is a promising paradigm for obtaining a considerable number of labels at low cost, enabling customers to label data by writing heuristic rules rather than annotating each instance individually. However, in the time-series domain, it is hard for people to write reasonable labeling functions, as time-series data is numerically continuous and difficult to understand. In this paper, we propose a Label-Efficient Interactive Time-Series Anomaly Detection (LEIAD) system, which enables a user to improve the results of unsupervised anomaly detection with only a small number of interactions with the system. To achieve this goal, the system integrates weak supervision and active learning collaboratively while generating labeling functions automatically from only a few labeled data points. All of these techniques are complementary and reinforce each other. We conduct experiments on three time-series anomaly detection datasets, demonstrating that the proposed system is superior to existing solutions in both the weak supervision and active learning areas. The system has also been tested in a real industrial scenario to show its practicality.
This paper investigates the use of artificial neural networks (ANNs) to solve differential equations (DEs) and the construction of a loss function that satisfies both the differential equation and its initial/boundary conditions. In Section 2, the loss function is generalized to $n^\text{th}$-order ordinary differential equations (ODEs). Other methods of construction are examined in Section 3 and applied to three different models to assess their effectiveness.
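As an illustration of what such a loss can look like, the block below writes out one common formulation for an $n^\text{th}$-order ODE. It is a sketch under stated assumptions and not necessarily the construction used in the paper: the residual form $F$, the network $N_\theta$, the collocation points $x_j$, and the initial values $c_k$ are all notation introduced here.

```latex
% Assumed setting: F(x, y, y', \dots, y^{(n)}) = 0 on [a, b],
% with initial conditions y^{(k)}(a) = c_k for k = 0, \dots, n-1.
% N_\theta is the network output; x_1, \dots, x_M are collocation points.
\[
\mathcal{L}(\theta)
= \frac{1}{M} \sum_{j=1}^{M}
  F\bigl(x_j, N_\theta(x_j), N_\theta'(x_j), \dots, N_\theta^{(n)}(x_j)\bigr)^2
+ \sum_{k=0}^{n-1} \bigl(N_\theta^{(k)}(a) - c_k\bigr)^2
\]
```

The first term penalizes violation of the differential equation at the sampled points, and the second penalizes violation of the initial conditions, so a network that drives both to zero satisfies the equation and its conditions at those points.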
Recently, great progress has been made in single-image super-resolution (SISR) based on deep learning. However, existing methods usually incur a large computational cost. Meanwhile, activation functions cause some intermediate-layer features to be lost. It is therefore a challenge to make the model lightweight while reducing the impact of intermediate feature loss on reconstruction quality. In this paper, we propose a Feature Interaction Weighted Hybrid Network (FIWHN) to alleviate this problem. Specifically, FIWHN uses a series of novel Wide-residual Distillation Interaction Blocks (WDIBs) as its backbone, where every three WDIBs form a Feature shuffle Weighted Group (FSWG) through mutual information mixing and fusion. In addition, to mitigate the adverse effects of intermediate feature loss on the reconstruction results, we introduce Wide Convolutional Residual Weighting (WCRW) and Wide Identical Residual Weighting (WIRW) units in the WDIB, and effectively cross-fuse features of different levels of fineness through a Wide-residual Distillation Connection (WRDC) framework and a Self-Calibrating Fusion (SCF) unit. Finally, to complement the global features lacking in the CNN model, we introduce the Transformer into our model and explore a new way of combining the CNN and the Transformer. Extensive quantitative and qualitative experiments on low-level and high-level tasks show that the proposed FIWHN achieves a good balance between performance and efficiency, and is more conducive to solving downstream tasks in low-pixel scenarios.